Machine Learning: ECML 2007

chapter

Learning, Information Extraction and the Web

Tom M. Mitchell

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Invited Talks > 1-1

Significant progress has been made recently in semi-supervised learning algorithms that require less labeled training data by utilizing unlabeled data. Much of this progress has been made in the context of natural language analysis (e.g., semi-supervised learning for named entity recognition and for relation extraction). This talk will overview progress in this area, present some of our own recent...

chapter

Putting Things in Order: On the Fundamental Role of Ranking in Classification and Probability Estimation

Peter A. Flach

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Invited Talks > 2-3

While a binary classifier aims to distinguish positives from negatives, a ranker orders instances from high to low expectation that the instance is positive. Most classification models in machine learning output some score of ‘positiveness’, and hence can be used as rankers. Conversely, any ranker can be turned into a classifier if we have some instance-independent means of splitting the ranking into...

chapter

Mining Queries

Ricardo Baeza-Yates

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Invited Talks > 4-4

User queries in search engines and Websites give valuable information on the interests of people. In addition, clicks after queries relate those interests to actual content. Even queries without clicks or answers imply important missing synonyms or content. In this talk we show several examples on how to use this information to improve the performance of search engines, to recommend better queries,...

chapter

Adventures in Personalized Information Access

Barry Smyth

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Invited Talks > 5-5

Access to information plays an increasingly important role in our everyday lives and we have come to rely more and more on a variety of information access services to bring us the right information at the right time. Recently the traditional one-size-fits-all approach, which has informed the development of the majority of today’s information access services, from search engines to portals, has been...

chapter

Statistical Debugging Using Latent Topic Models

David Andrzejewski, Anne Mulhern, Ben Liblit, Xiaojin Zhu

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 6-17

Statistical debugging uses machine learning to model program failures and help identify root causes of bugs. We approach this task using a novel Delta-Latent-Dirichlet-Allocation model. We model execution traces attributed to failed runs of a program as being generated by two types of latent topics: normal usage topics and bug topics. Execution traces attributed to successful runs of the same program,...

chapter

Learning Balls of Strings with Correction Queries

Leonor Becerra Bonache, Colin de la Higuera, Jean-Christophe Janodet, Frédéric Tantini

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 18-29

During the 80’s, Angluin introduced an active learning paradigm, using an Oracle, capable of answering both membership and equivalence queries. However, practical evidence tends to show that if the former are often available, this is usually not the case of the latter. We propose new queries, called correction queries, which we study in the framework of Grammatical Inference. When a string is submitted...

chapter

Neighborhood-Based Local Sensitivity

Paul N. Bennett

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 30-41

We introduce a nonparametric model for sensitivity estimation which relies on generating points similar to the prediction point using its k nearest neighbors. Unlike most previous work, the sampled points differ simultaneously in multiple dimensions from the prediction point in a manner dependent on the local density. Our approach is based on an intuitive idea of locality which uses the Voronoi cell...

chapter

Approximating Gaussian Processes with $\mathcal{H}^2$-Matrices

Steffen Börm, Jochen Garcke

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 42-53

To compute the exact solution of Gaussian process regression one needs $\mathcal{O}(N^3)$ computations for direct and $\mathcal{O}(N^2)$ for iterative methods, since it involves a densely populated kernel matrix of size $N \times N$, where $N$ denotes the number of data points. This makes large scale learning problems intractable by standard techniques. We propose to use an alternative approach: the...

chapter

Learning Metrics Between Tree Structured Data: Application to Image Recognition

Laurent Boyer, Amaury Habrard, Marc Sebban

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 54-66

The problem of learning metrics between structured data (strings, trees or graphs) has been the subject of various recent papers. With regard to the specific case of trees, some approaches focused on the learning of edit probabilities required to compute a so-called stochastic tree edit distance. However, to reduce the algorithmic and learning constraints, the deletion and insertion operations are...

chapter

Shrinkage Estimator for Bayesian Network Parameters

John Burge, Terran Lane

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 67-78

Maximum likelihood estimates (MLEs) are commonly used to parameterize Bayesian networks. Unfortunately, these estimates frequently have unacceptably high variance and often overfit the training data. Laplacian correction can be used to smooth the MLEs towards a uniform distribution. However, the uniform distribution may represent unrealistic relationships in the domain being modeled and can add...

chapter

Level Learning Set: A Novel Classifier Based on Active Contour Models

Xiongcai Cai, Arcot Sowmya

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 79-90

This paper presents a novel machine learning algorithm for pattern classification based on image segmentation and optimisation techniques employed in active contour models and level set methods. The proposed classifier, named level learning set (LLS), has the ability to classify general datasets including sparse and non sparse data. It moves developments in vision segmentation into general machine...

chapter

Learning Partially Observable Markov Models from First Passage Times

Jérôme Callut, Pierre Dupont

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 91-103

We propose a novel approach to learn the structure of Partially Observable Markov Models (POMMs) and to estimate jointly their parameters. POMMs are graphical models equivalent to Hidden Markov Models (HMMs). The model structure is built to support the First Passage Times (FPT) dynamics observed in the training sample. We argue that the FPT in POMMs are closely related to the model structure. Starting...

chapter

Context Sensitive Paraphrasing with a Global Unsupervised Classifier

Michael Connor, Dan Roth

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 104-115

Lexical paraphrasing is an inherently context sensitive problem because a word’s meaning depends on context. Most paraphrasing work finds patterns and templates that can replace other patterns or templates in some context, but we are attempting to make decisions for a specific context. In this paper we develop a global classifier that takes a word v and its context, along with a candidate word u,...

chapter

Dual Strategy Active Learning

Pinar Donmez, Jaime G. Carbonell, Paul N. Bennett

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 116-127

Active Learning methods rely on static strategies for sampling unlabeled point(s). These strategies range from uncertainty sampling and density estimation to multi-factor methods with learn-once-use-always model parameters. This paper proposes a dynamic approach, called DUAL, where the strategy selection parameters are adaptively updated based on estimated future residual error reduction after each...

chapter

Decision Tree Instability and Active Learning

Kenneth Dwyer, Robert Holte

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 128-139

Decision tree learning algorithms produce accurate models that can be interpreted by domain experts. However, these algorithms are known to be unstable – they can produce drastically different hypotheses from training sets that differ just slightly. This instability undermines the objective of extracting knowledge from the trees. In this paper, we study the instability of the C4.5 decision tree learner...

chapter

Constraint Selection by Committee: An Ensemble Approach to Identifying Informative Constraints for Semi-supervised Clustering

Derek Greene, Pádraig Cunningham

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 140-151

A number of clustering algorithms have been proposed for use in tasks where a limited degree of supervision is available. This prior knowledge is frequently provided in the form of pairwise must-link and cannot-link constraints. While the incorporation of pairwise supervision has the potential to improve clustering accuracy, the composition and cardinality of the constraint sets can significantly...

chapter

The Cost of Learning Directed Cuts

Thomas Gärtner, Gemma C. Garriga

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 152-163

In this paper we investigate the problem of classifying vertices of a directed graph according to an unknown directed cut. We first consider the usual setting in which the directed cut is fixed. However, even in this setting learning is not possible without in the worst case needing the labels for the whole vertex set. By considering the size of the minimum path cover as a fixed parameter, we derive...

chapter

Spectral Clustering and Embedding with Hidden Markov Models

Tony Jebara, Yingbo Song, Kapil Thadani

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 164-175

Clustering has recently enjoyed progress via spectral methods which group data using only pairwise affinities and avoid parametric assumptions. While spectral clustering of vector inputs is straightforward, extensions to structured data or time-series data remain less explored. This paper proposes a clustering method for time-series data that couples non-parametric spectral clustering with parametric...

chapter

Probabilistic Explanation Based Learning

Angelika Kimmig, Luc De Raedt, Hannu Toivonen

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 176-187

Explanation based learning produces generalized explanations from examples. These explanations are typically built in a deductive manner and they aim to capture the essential characteristics of the examples. Probabilistic explanation based learning extends this idea to probabilistic logic representations, which have recently become popular within the field of statistical relational learning...

chapter

Graph-Based Domain Mapping for Transfer Learning in General Games

Gregory Kuhlmann, Peter Stone

Lecture Notes in Computer Science > Machine Learning: ECML 2007 > Long Papers > 188-200

A general game player is an agent capable of taking as input a description of a game’s rules in a formal language and proceeding to play without any subsequent human input. To do well, an agent should learn from experience with past games and transfer the learned knowledge to new problems. We introduce a graph-based method for identifying previously encountered games and prove its robustness formally...

Machine Learning: ECML 2007
18th European Conference on Machine Learning, Warsaw, Poland, September 17-21, 2007. Proceedings
